
add Concat quantization #17448

Merged
luotao1 merged 3 commits into PaddlePaddle:develop from sfraczek:concat-quantization
May 27, 2019

Conversation


@sfraczek commented May 16, 2019

In files graph_pattern_detector.cc, graph_pattern_detector.h, cpu_quantize_pass.cc, cpu_quantize_pass.h, and mkldnn_quantizer.cc:

  • added Concat quantization code

In file mkldnn_quantizer.cc:

  • handled multiple tensors wired to a single input
  • extended the list of ops that do not modify the sign of the values in a tensor
  • set the type to unsigned after a regular ReLU op

@sfraczek
Author

I had to add the use_quantizer flag to the concat operator because it was missing in this PR.

wojtuss
wojtuss previously approved these changes May 23, 2019
Sylwester Fraczek added 3 commits May 24, 2019 11:53

  • add unit test for quantizing concat
  • fix for wrong value when the input is not in map of calculated scales
  • add use_quantizer to concat_op.cc
  • add scale_algo rules for concat

test=develop
@luotao1 luotao1 merged commit 96845d2 into PaddlePaddle:develop May 27, 2019
@luotao1
Contributor

luotao1 commented May 27, 2019

Which model uses this quantization? Do you have benchmarks from before and after this quantization?

@sfraczek
Author

GoogleNet and MobileNet-SSD benefit from this (but we do not have a test for MobileNet-SSD yet).
GoogleNet is about 1.67x faster with concat quantized than before (measured on my development i9 with batch size 50 over 1000 iterations).

@wojtuss wojtuss added this to the v1.5 for Intel milestone May 28, 2019